Technique For Clustering Uncertain Data Based On Probability Distribution Similarity

Author

Vandana Dubey

Abstract

Clustering uncertain data is one of the essential tasks in data mining. Traditional algorithms for clustering uncertain data, such as K-Means, UK-Means, and density-based clustering, are limited to geometric distance-based similarity measures and cannot capture differences between the distributions of uncertain objects. Such methods cannot handle uncertain objects that are geometrically indistinguishable, such as products with the same mean but very different variances in customer ratings. K-Medoid clustering based on KL divergence similarity instead groups uncertain data by the similarity of their probability distributions. Several methods have been proposed for clustering uncertain data, and some of them are reviewed here. Compared to the traditional clustering methods, the K-Medoid clustering algorithm based on KL divergence similarity is more efficient. First, each uncertain data object is modeled by a probability distribution; next, the similarity between data objects is measured using a distance metric; finally, a clustering method such as partition clustering or density-based clustering is applied. This paper proposes a new method for making the algorithm more effective with the consideration of initial selection of med...
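The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the rating distributions, the function names (`mean`, `kl_divergence`, `k_medoids`), and the fixed initial medoids are all assumptions made for illustration, and the sketch assumes discrete distributions with strictly positive probabilities. It shows four products whose mean rating is identical, so a mean-based geometric distance cannot separate them, while KL divergence separates the narrow distributions from the wide ones.

```python
import math

RATINGS = [1, 2, 3, 4, 5]  # hypothetical 5-star rating scale

def mean(p):
    """Expected rating under the discrete distribution p."""
    return sum(r * w for r, w in zip(RATINGS, p))

def kl_divergence(p, q):
    """KL(p || q); assumes every probability in p and q is strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def k_medoids(dists, init, max_iter=100):
    """Plain alternating k-medoids using KL divergence as the dissimilarity."""
    n, k = len(dists), len(init)
    # d[i][j] = KL(object i || object j); note that KL is not symmetric
    d = [[kl_divergence(dists[i], dists[j]) for j in range(n)] for i in range(n)]
    medoids = list(init)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each object joins the medoid it diverges from least
        labels = [min(range(k), key=lambda c: d[i][medoids[c]]) for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster becomes empty
                new_medoids.append(medoids[c])
                continue
            # Update step: pick the member with the least total divergence to it
            new_medoids.append(min(members, key=lambda m: sum(d[i][m] for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels

# Made-up rating distributions: all four have mean 3, but different spread
products = [
    [0.05, 0.10, 0.70, 0.10, 0.05],  # narrow
    [0.04, 0.12, 0.68, 0.12, 0.04],  # narrow
    [0.30, 0.15, 0.10, 0.15, 0.30],  # wide
    [0.28, 0.17, 0.10, 0.17, 0.28],  # wide
]

medoids, labels = k_medoids(products, init=[0, 2])  # init chosen by hand
print([round(mean(p), 6) for p in products])
print(labels)
```

The initial medoids are fixed by hand here; the abstract indicates that the choice of initial medoids affects the algorithm's effectiveness, and this sketch makes no attempt to reproduce the paper's selection strategy.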

Similar resources

Clustering Multi-Attribute Uncertain Data using Probability Distribution

Clustering is an unsupervised classification technique for grouping a set of abstract objects into classes of similar objects. Clustering uncertain data is one of the essential tasks in mining uncertain data. Uncertain data is typically found in areas such as sensor networks, weather data, and customer rating data. The earlier methods for clustering uncertain data based on probability distribution,...

A Review of Clustering Algorithms for Clustering Uncertain Data

Clustering is an important task in data mining. Clustering uncertain data is challenging both in modeling the similarity between uncertain objects and in developing efficient computational methods. Most previous methods extend partitioning clustering methods and density-based clustering methods, which rely on the geometric distance between two objects. Such methods cannot ...

Clustering on Uncertain Data using Kullback Leibler Divergence Measurement based on Probability Distribution

Cluster analysis is one of the important data analysis methods and is a very complex task. It is the art of detecting groups of similar objects in large data sets without requiring groups to be specified by means of explicit features or knowledge of the data. Clustering uncertain data is a very difficult task both in modeling the similarity between uncertain data objects and in developing efficient computat...

An Efficient Divergence and Distribution Based Similarity Measure for Clustering Of Uncertain Data

Data mining is the extraction of hidden predictive information from large databases. Clustering is one of the popular data mining techniques. Clustering uncertain data, one of the essential tasks in mining uncertain data, poses significant challenges both in modeling the similarity between uncertain objects and in developing efficient computational methods. The previous methods extend traditional p...

A Novel And Improved Technique For Clustering Uncertain Data

Clustering uncertain data is one of the essential tasks in data mining. Traditional algorithms for clustering uncertain data, such as K-Means, UK-Means, and density-based clustering, are limited to geometric distance-based similarity measures and cannot capture differences between the distributions of uncertain objects. Such methods cannot handle uncertain objects t...

Publication date: 2015
